Parallel Dynamics and Computational Complexity of Network Growth Models 
The parallel computational complexity or depth of growing network models is
investigated. The networks considered are generated by preferential attachment
rules where the probability of attaching a new node to an existing node is given
by a power, α, of the connectivity of the existing node. Algorithms for
generating growing networks very quickly in parallel are described and studied.
The sublinear and superlinear cases require distinct algorithms. As a result,
there is a discontinuous transition in the parallel complexity of sampling these
networks corresponding to the discontinuous structural transition at α = 1,
where the networks become scale free. For α > 1 networks can be generated in
constant time while for 0 ≤ α < 1 logarithmic parallel time is required. The
results show that these networks have little depth and embody very little
history dependence despite being defined by sequential growth rules.



I. INTRODUCTION 



This paper is concerned with the complexity of net- 
works. Many features of biological, social and techno- 
logical systems can be described in terms of networks. 
Examples include gene networks, friendship networks, citation
networks, the power grid, the internet and the
world wide web. Although the systems that generate
these networks are extremely complex, the networks
themselves may or may not evidence this complexity. In 
many cases the networks generated by complex systems 
are approximately scale free. Barabasi and Albert
(BA) showed that scale free networks can be generated 
by rules for network growth that embody the intuitively 
plausible idea of preferential attachment. In their model, 
the network grows by the addition of one node at a time 
and each node creates one new connection to an existing 
node. Existing nodes in the network that already have 
many connections are more likely to gain the new con- 
nection from the new node added to the network. The 
growing network model seems to incorporate a history 
dependent process, albeit simplified, into the generation 
of the network. 

One of the essential markers of complexity is a long 
history. Complex systems cannot arise instantaneously 
but require a long sequence of interactions to develop. 
Neither "complexity" nor "long history" are well-defined 
concepts but an appropriate proxy for these ideas can 
be formulated within computational complexity theory. 
Computational complexity theory is concerned with the 
resources required to solve problems. Although there are 
various resources required to solve computational prob- 
lems, here we focus on parallel time or depth. Depth is 
the number of computational steps needed by a parallel 
computer to solve a problem. In our case, the problem 
is to generate a statistically correct representation of the 
network. If the depth of the computation needed to gen- 
erate the network is large, even using the most efficient 
algorithm, we say that the network has a long history and 
cannot be generated quickly. If, on the other hand, only
a few parallel steps are needed to generate the network,
then it cannot be complex.

The BA growing network model would appear to have 
substantial depth since nodes are added to the network 
one at a time and the preferential attachment rule uses 
knowledge of the existing state of the network to decide 
where each new node will attach. If the BA model cap- 
tures the mechanism for the scale free behavior found 
in real world networks then perhaps one can conclude 
that some of the complexity or history dependence of the 
social, biological or technological system that generated 
the network is embodied in the network. One of the main 
conclusions of this paper is that growing network mod- 
els do not actually embody much history dependence. 
What we show is that there is a fast parallel algorithm
that generates BA growing networks with N nodes in
O(log log N) steps.

The BA model has a linear preferential attachment
rule. Krapivsky, Redner and Leyvraz introduced a
generalization of the BA model in which the probability
to connect to a node is proportional to a power, α, of
its number of connections. The original BA model is
the case α = 1 while α = 0 is a random network. The
class of models 0 ≤ α < ∞ is analyzed in Ref. [3]
and it is seen that α = 1 marks a "phase transition"
between a "high temperature phase" for α < 1 where no
node has an extensive number of connections and a "low
temperature phase" for α > 1 where a single node has
almost all connections in the large N limit.

We show that distinct but related parallel algorithms
are needed to efficiently simulate the α < 1 and α > 1
regimes so that there is a discontinuous transition in
the computational complexity of simulating the model at
α = 1. For 0 ≤ α < 1 the parallel time for generating
a network of size N scales logarithmically in N while for
1 < α < ∞ there is a constant time algorithm. Exactly
at α = 1 yet a third algorithm is most efficient, with
parallel running time that is O(log log N).

A number of non-equilibrium models in statistical 
physics defined by sequential rules have been shown 
to have fast parallel dynamics. Examples include the 
Eden model, invasion percolation, the restricted
solid-on-solid model, the Bak-Sneppen model and
internal diffusion-limited aggregation, all of which can
be simulated in parallel in polylogarithmic time. On the
other hand, no polylog time algorithm is known for
generating diffusion-limited aggregation clusters and there is
evidence that only power-law speed-ups are possible using
parallelism.

Phase transitions in computational complexity have
been the object of considerable recent study. Most of
the attention has been focused on NP-hard
combinatorial optimization problems.
Growing networks and many other physically motivated 
models are naturally related to problems in the lower 
class P (problems solvable in polynomial time). One of 
the purposes of this paper is to provide an example of a 
transition in computational complexity at this lower level 
of the complexity hierarchy. 

The paper is organized as follows. In the next section
we define and describe the class of preferential attachment
network growth models to be studied. In Sec. III we give
a brief review of relevant features of parallel computational
complexity theory. Section IV presents efficient
parallel algorithms for sampling growing network models
and related systems, Sec. V analyzes the efficiency of
these algorithms and Sec. VI presents results from
numerical studies of the efficiency of one of the algorithms.
The paper ends with a discussion.



II. GROWING NETWORK MODELS 

In this section we describe growing network models
with preferential attachment first considered by Barabasi
and Albert and later generalized by Krapivsky, Redner
and Leyvraz [3]. Consider a graph with N ordered
nodes, each having one outgoing link, constructed by the
addition of one node every time step so that at time t in
the construction, node t is attached to a previous node,
0 through t − 1. The probability $\pi_n(t)$ of attaching node
t to node n < t is given by



$$ \pi_n(t) = \frac{F(k_n(t))}{Z(t)} \tag{1} $$

where $k_n(t)$ is the degree (number of connections) of node n
at time t, F is some function and Z is the normalization
given by

$$ Z(t) = \sum_{j=0}^{t-1} F(k_j(t)). \tag{2} $$

We require that F(k) is a non-decreasing function of k.
Notice that, in general, $\pi_n(t)$ is a function not only of
$k_n(t)$ but also of $k_j(t)$ for all j < t because of the
normalization, Z. The attachment probabilities depend on
all the node degrees unless Z(t) is a function of t alone.
This simpler form holds if and only if F is a linear
function, F(k) = a + bk. In the latter case, Z(t) = (a + 2b)t
since $\sum_{j=0}^{t-1} k_j(t) = 2t$.
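
As a point of reference, the sequential growth rule of Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper: the name `grow_network`, the convention that node zero starts with a self-loop contributing 2 to its degree, and the way the kernel is passed as a function `F` are all assumptions made for the example.

```python
import random

def grow_network(n_nodes, F, rng):
    """Sequentially grow a network with attachment kernel F (Eqs. 1 and 2).

    Assumption: node 0 starts with a self-loop that contributes 2 to its
    degree, so the degrees of the first t nodes always sum to 2t.
    """
    target = [0]   # target[t] is the node that t attaches to
    degree = [2]   # degree[n] = k_n(t), in-degree plus out-degree
    for t in range(1, n_nodes):
        weights = [F(k) for k in degree]                # F(k_n(t)) for n < t
        n = rng.choices(range(t), weights=weights)[0]   # chosen with prob. F(k_n)/Z(t)
        target.append(n)
        degree[n] += 1     # incoming link on node n
        degree.append(1)   # outgoing link of the new node t
    return target, degree

# Linear homogeneous kernel F(k) = k, i.e. the BA case.
target, degree = grow_network(1000, lambda k: k, random.Random(0))
print(max(degree), sum(degree))
```

Each new node needs the degrees produced by all earlier nodes, which is the apparent history dependence that the parallel algorithms of Sec. IV are designed to avoid.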



The linear homogeneous case, F(k) = k, corresponds
to the original Barabasi-Albert model and leads to
a scale free network where the degree distribution, P(k),
has a power law tail, $P(k) \sim k^{-3}$. More generally, if F(k)
is asymptotically linear, $P(k) \sim k^{-\nu}$ where ν is tunable to
any value 2 < ν < ∞. The asymptotically
linear attachment kernel is a marginal case and marks
a "phase transition" between regimes with qualitatively
different behavior. Consider the homogeneous models,
$F(k) = k^{\alpha}$, studied in detail in Ref. [3]. In the sublinear
case, 0 < α < 1, the degree distribution has a stretched
exponential form and the node with the maximum degree
has polylogarithmically many connections. The limiting
case of α = 0 is a random network where each connection
is randomly and independently chosen. There is an
analogy between α and temperature in a thermodynamic
system with the range 0 ≤ α < 1 like a high temperature
phase. The order parameter is the maximum degree in
the system divided by N and the order parameter
vanishes for 0 ≤ α < 1. In the superlinear or low temperature
phase, α > 1, there is a single "gel" node that has
almost all connections and the order parameter is unity.
The phase transition then has a discontinuous character
despite the fact that the α = 1 state is scale free.
Another indication that the transition is discontinuous is
seen by looking at the entropy. Using the Kolmogorov-Chaitin
definition of entropy as the minimum number of
bits required to describe a system state, it is clearly
seen that the entropy per node is positive for all α < 1
but that for α > 1 the entropy per node vanishes since
almost all nodes connect to the gel node and it is only
necessary to specify the connections for those nodes that
do not connect to the gel node. Thus, the entropy per
node is also discontinuous at α = 1.



III. PARALLEL COMPUTATION AND DEPTH 

Computational complexity theory is concerned with 
the scaling of computational resources needed to solve 
problems as a function of the size of the problem. An introduction
to the field can be found in Ref. [13]. Here we
focus on parallel computation and choose the standard 
parallel random access machine (PRAM) as the model of 
computation [14]. The main resources of interest are parallel
time or depth and number of processors. A PRAM
consists of a number of simple processors (random access 
machines or RAMs) all connected to a global memory. 
Although a RAM is typically defined with much less computational
power than a real microprocessor such as a Pentium,
it would not change the scaling found here to think
of a PRAM as being composed of many microprocessors 
all connected to the same random access memory. The 
processors run synchronously and each processor runs the 
same program. Processors have an integer label so that 
different processors follow different computational paths. 
The PRAM is the most powerful model of classical, digital
computation. The number of processors and the amount of memory
are allowed to increase polynomially (i.e. as an arbitrary
power) in the size of the problem to be solved. Communication
is non-local in that it is assumed that any processor
can communicate with any memory cell in a single
time step. Obviously, this assumption runs up against 
speed of light or hardware density limitations. Nonethe- 
less, parallel time on a PRAM quantifies a fundamental 
aspect of computation. Any problem that can be solved 
by a PRAM with H processors in parallel time T could 
also be solved by a single processor machine in a time 
W such that W ≤ HT since the single processor could
sequentially run through the tasks that were originally 
assigned to the H processors. The single processor time, 
W is sometimes referred to as the computational work. 
On the other hand, it is not obvious whether the work of 
a single processor can be re-organized so that it can be 
accomplished in a substantially smaller number of steps 
by many processors working independently during each 
step. 

An example of where exponential speed-up can be
achieved through parallelism is adding N numbers. Addition
can be done by a single processor in a time that scales
linearly in N. On a PRAM with N/2 processors addition
can be carried out in O(log N) parallel time using a
binary tree. For simplicity, suppose N is a power of 2.
In the first step, processor one adds the first and second
numbers and puts the result in memory, processor two
adds the third and fourth numbers and puts the result in
memory and so on. After the first step is concluded there
are N/2 numbers to add and these are again summed in
a pairwise fashion by N/4 processors. The summation is
completed after O(log N) steps. Addition is said to have
an efficient parallel algorithm in the sense that it can
be solved in time that is a power of the logarithm of the
problem size, here N, that is, polylog time. On the other
hand, it is believed that there are some problems that
can be solved in polynomial time using a single processor
but cannot be efficiently parallelized. It is believed
that P-complete problems have this property and
cannot be solved in polylog time with polynomially many
processors.
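
The binary-tree summation described above can be illustrated with a short Python sketch. It runs on a single processor, so it only counts the parallel rounds a PRAM would need; the function name `tree_sum` and the zero-padding of odd-length levels are assumptions made for the example, and the input is assumed to be non-empty.

```python
def tree_sum(values):
    """Pairwise (binary tree) summation; returns the sum and the number of rounds.

    Each pass of the while loop stands for one parallel step in which every
    pair of neighbouring numbers is added by its own processor.
    """
    level = list(values)
    rounds = 0
    while len(level) > 1:
        if len(level) % 2:      # pad odd-length levels so that every number has a partner
            level.append(0)
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        rounds += 1
    return level[0], rounds

total, rounds = tree_sum(range(16))
print(total, rounds)   # 120 4
```

For N = 16 numbers the sum is complete after log2(16) = 4 rounds, compared with 15 additions performed one after another on a single processor.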

The main concern of this paper is the complexity of 
generating networks defined by preferential attachment 
growth rules. Since these networks grow via a stochastic 
process, we envision a PRAM model equipped with regis- 
ters containing random numbers. The essential question 
that we seek to answer is the depth (number of PRAM 
steps) required to convert a set of independent random 
bits into a statistically correct network. 



IV. PARALLEL ALGORITHMS FOR GROWING 
NETWORK MODELS 

At first glance, it seems that growing networks have a 
strong history dependence. It would appear that to con- 
nect some node t appropriately one must first connect 
all nodes prior to t in order to compute the connection 



probabilities for t according to Eq. ^ Surprisingly, one 
can construct a statistically correct network using an it- 
erative parallel process that converges in far fewer than t 
steps. The strategy is to place progressively tighter lower 
bounds on the connection probabilities based on connec- 
tions made in previous parallel steps in the process. 



A. A Coin Toss with Memory 

A simple example of the general strategy is instructive. 
Consider a biased coin toss with memory such that the 
number of heads on the first t coin tosses modifies the 
probability of a head on toss t + 1. Suppose that more 
heads on previous tosses increases the probability of a 
head on the current toss according to some function f(x)
where π(t) = f(x(t)) is the probability of a head on the
t-th coin toss and x(t) is the fraction of heads on all the
coins tossed before t. Suppose that f is a non-decreasing
function of its argument and that f(0) > 0. Note that
the special case f(x) = x is a Polya urn problem and is
discussed in Sec. IV D.

The goal is to simulate a sequence of N coin tosses. It 
would appear that we cannot decide coin t until we have 
decided all its predecessors. Nonetheless, we can proceed 
in parallel by successively improving lower bounds on the 
probability that a given coin toss is a head. Let

$$ p^S(t) = f(x^S(t)) \tag{3} $$

be an estimated lower bound on the probability that the
t-th coin toss is a head on the S-th step of the algorithm,
where $x^S(t)$ is the fraction of tosses determined to be
heads at the beginning of iteration S. The starting
assumption is that none of the tosses have been determined,
$x^1(t) = 0$ for all t, and this assumption is used to
compute how many coins become heads on the first iteration.
Thus, $p^1(t) = f(0)$ and, on the first iteration, coin t
becomes a head with this probability. Once a coin becomes
a head, it stays a head while coins that are not heads
remain undecided. On the second iteration, we make use of
the heads decided in the first iteration to recompute the
fraction determined to be heads, $x^2(t)$, and from these
obtain the new bounds $p^2(t) = f(x^2(t)) \ge p^1(t)$. For each
coin t that is not yet determined to be a head we declare
it a head with the conditional probability $\rho^2(t)$ that it will
become a head on this step given that it is not yet a head,
$\rho^2(t) = (p^2(t) - p^1(t))/(1 - p^1(t))$. Some new coins are
declared heads and these are then used to compute $x^3(t)$.
In general, if coin t is not yet determined by step S, it
becomes a head with probability

$$ \rho^S(t) = \frac{p^S(t) - p^{S-1}(t)}{1 - p^{S-1}(t)}, \tag{4} $$

where $\rho^S(t)$ is the conditional probability of coin t
becoming a head on step S given it was undecided up to
step S. The expression for the conditional probability
follows from the observation that the denominator is the
marginal probability of being undecided after step S − 1
and the numerator is the probability of becoming a head
on step S. The algorithm stops on step T when there is
no change from one step to the next, $x^T(t) = x^{T-1}(t)$ for
all t, and the lower bounds equal the true probabilities,
$p^T(t) = \pi(t)$. At the end of the simulation, every coin
that is not a head is declared to be a tail. For every t,

$$ x^1(t) \le x^2(t) \le \dots \le x^T(t) = x(t) \tag{5} $$

so that

$$ p^1(t) \le p^2(t) \le \dots \le p^T(t) = \pi(t). \tag{6} $$



Thus the procedure is well-defined and we can decide in 
stages whether coin t will be a head. 
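
The coin toss iteration can be sketched in Python as follows. One implementation choice here is an assumption and is not taken from the text: instead of drawing a fresh random number with the conditional probability of Eq. (4) on every step, each coin t keeps a single uniform number u[t] and is declared a head as soon as u[t] falls below the current bound. Given that the coin is still undecided, this happens with exactly the conditional probability of Eq. (4). With this coupling the parallel result can be compared directly with a sequential reference run on the same random numbers. The function names, and the indexing of coins from zero, are likewise illustrative.

```python
import random

def sequential_tosses(f, u):
    """Reference: decide the coins one at a time; u[t] is the uniform number of coin t."""
    heads, n_heads = [], 0
    for t, ut in enumerate(u):
        x = n_heads / t if t > 0 else 0.0   # fraction of heads before coin t
        h = ut < f(x)
        heads.append(h)
        n_heads += h
    return heads

def parallel_tosses(f, u):
    """Iterate the lower bounds p^S(t) = f(x^S(t)) until nothing changes."""
    n = len(u)
    heads = [False] * n
    steps = 0
    while True:
        steps += 1
        # x^S(t): fraction of earlier coins already decided to be heads.
        # On a PRAM these prefix counts would come from a parallel scan.
        x, count = [], 0
        for t in range(n):
            x.append(count / t if t > 0 else 0.0)
            count += heads[t]
        new_heads = [heads[t] or u[t] < f(x[t]) for t in range(n)]
        if new_heads == heads:       # no change from one step to the next
            return heads, steps
        heads = new_heads

rng = random.Random(1)
u = [rng.random() for _ in range(500)]
f = lambda x: 0.2 + 0.6 * x          # non-decreasing, f(0) > 0
heads, steps = parallel_tosses(f, u)
print(steps, heads == sequential_tosses(f, u))
```

Because the bounds only increase, a coin that has become a head never reverts, and the final pass that changes nothing certifies that every bound has reached its true value.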

In the following two sections we show how to gener- 
alize this strategy to the case of preferential attachment 
growing network models. 



B. Parallel Algorithm for Linear and Sublinear 
Kernels 



This section describes a parallel algorithm for constructing
a network with a sublinear or linear attachment
rule, $F(k) = k^{\alpha}$ where 0 ≤ α ≤ 1 or, more generally,
the case where the attachment weight F(k) is a
non-decreasing, concave function of k. As in the coin toss
example, on intermediate parallel steps we have nodes 
whose connections are not yet determined. In this al- 
gorithm we lump all of these connections into a "ghost" 
node whose in-degree is equal to the number of nodes 
that have not yet been determined. On every parallel 
time step, S, the algorithm attempts to connect every 
node that is currently connected to the ghost node to a 
real node according to lower bounds on the connection 
probabilities determined by connections that have been 
made in previous steps. 

In the initialization, S = 0, step of the algorithm, a
ghost node is created and all real nodes are connected to
it, except node zero, which connects to itself, and node
one, which also connects to node zero. Thus, for S = 0
and every sequential time t > 1, every real node n < t has
in-degree 0 and out-degree 1, except the zero node which
has both in- and out-degree equal to 1. The ghost node
has in-degree t − 1 for t > 0. Let $k_g^S(t)$ be the number of
nodes connecting to the ghost node at the beginning of
parallel step S and sequential time t so that $k_g^1(t) = t - 1$
for t > 0. In the first, S = 1, step of the algorithm the
connection probability lower bound for node t to connect
to node n, $p_n^1(t)$, is given by

$$ p_n^1(t) = \begin{cases} F(2)/Z^1(t) & n = 0 \\ F(1)/Z^1(t) & n > 0 \end{cases} \tag{7} $$



while the connection probability, $Q^1(t)$, for the ghost node
is taken to be proportional to its number of connections,

$$ Q^1(t) = c(t-1)/Z^1(t) \tag{8} $$

with the normalization given by

$$ Z^1(t) = c(t-1) + F(2) + (t-1)F(1). \tag{9} $$

The constant c is discussed below. These are the connection
probabilities that would arise if each real node has
one connection and the ghost node has an attachment
probability proportional to its degree. On the first step
of the algorithm, each node t is connected to one of its
predecessors or the ghost node according to the
probabilities given above.

As in the case of the coin toss model described in the
previous section, on successive steps we recompute the
bounds on the connection probabilities $p_n^S(t)$ for the real
nodes and the ghost node $Q^S(t)$. For general S, t and n
these probabilities are given by

$$ p_n^S(t) = F(k_n^S(t))/Z^S(t) \tag{10} $$

$$ Q^S(t) = c\,k_g^S(t)/Z^S(t) \tag{11} $$

with the normalization given by

$$ Z^S(t) = c\,k_g^S(t) + \sum_{m=0}^{t-1} F(k_m^S(t)). \tag{12} $$

On step S of the algorithm, the conditional probability,
$\rho_n^S(t)$, of connecting node t to node n, given that node t
has not yet connected to a real node on an earlier step,
is given by the difference between the probability bounds
on successive steps divided by the marginal probability of
being undetermined (connected to the ghost node) before
step S,

$$ \rho_n^S(t) = \frac{p_n^S(t) - p_n^{S-1}(t)}{1 - \sum_{m=0}^{t-1} p_m^{S-1}(t)}. \tag{13} $$

Note that the denominator can be written as

$$ 1 - \sum_{m=0}^{t-1} p_m^{S-1}(t) = Q^{S-1}(t). \tag{14} $$

On step S of the algorithm each node t that was still
connected to the ghost node after step S − 1 is connected
with probability $\rho_n^S(t)$ to real node n < t or, with
probability

$$ \rho_g^S(t) = Q^S(t)/Q^{S-1}(t), \tag{15} $$

remains connected to the ghost node. The algorithm is
finished after T steps when there are no more nodes
connected to the ghost node and the bounds of Eq. (10)
saturate to the correct probabilities of Eq. (1). Note that at
least one node must connect in each parallel step since
the lowest numbered node that is still unconnected will
have no weight allotted to it in the ghost node.

For the conditional probabilities $\rho_n^S(t)$ to be positive,
the probability bounds must be non-decreasing for all n
and t,

$$ p_n^1(t) \le p_n^2(t) \le \dots \le p_n^T(t). \tag{16} $$
These inequalities imply a bound on c as follows. Since
F(k) is a non-decreasing function of k and $k_n^S(t)$ is a
non-decreasing function of S it is sufficient to require that
$Z^S(t)$ is a non-increasing function of S, $Z^S(t) \le Z^{S-1}(t)$,
or, since

$$ k_g^S(t) = 2t - 1 - \sum_{m=0}^{t-1} k_m^S(t), \tag{17} $$

we require that

$$ \sum_{m=0}^{t-1} \left( F(k_m^S(t)) - c\,k_m^S(t) \right) \le \sum_{m=0}^{t-1} \left( F(k_m^{S-1}(t)) - c\,k_m^{S-1}(t) \right). \tag{18} $$

This inequality is satisfied term by term if F(k) − ck is
non-increasing which holds if

$$ c \ge \max_k \{ F(k+1) - F(k) \}. \tag{19} $$

Since the algorithm will finish fastest if the ghost node
has the smallest possible weight, we set c equal to its
lower bound. In particular, for the power law case,
$F(k) = k^{\alpha}$ with α ≤ 1, the maximum occurs for k = 1
yielding

$$ c = 2^{\alpha} - 1. \tag{20} $$
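
The ghost-node iteration of this section can be sketched in Python for the kernel F(k) = k^α with 0 ≤ α ≤ 1. This is a minimal single-processor sketch under stated assumptions, not the authors' code: the loop over t stands in for processors that would work simultaneously, the bounds of the previous step are stored explicitly for every undetermined node (quadratic memory), the self-loop of node zero is counted as degree 2, and node one is treated like any other node (it connects to node zero with probability one on the first step). The names `ghost_node_network` and `GHOST` are hypothetical.

```python
import random

GHOST = -1   # marker: the connection of this node is not yet determined

def ghost_node_network(n_nodes, alpha, rng):
    """Sketch of the ghost-node iteration for F(k) = k**alpha, 0 <= alpha <= 1."""
    F = lambda k: k ** alpha
    c = 2.0 ** alpha - 1.0               # Eq. (20)
    target = [GHOST] * n_nodes
    target[0] = 0                        # node zero connects to itself
    p_prev = [None] * n_nodes            # bounds p_n^{S-1}(t), one list per node t
    q_prev = [1.0] * n_nodes             # ghost probability Q^{S-1}(t); 1 before the first step
    steps = 0
    while GHOST in target:
        steps += 1
        new_target = list(target)
        deg = [2]                        # degrees k_m^S(t) of the nodes m < t
        sum_f = F(2)
        kg = 0                           # ghost degree k_g^S(t)
        for t in range(1, n_nodes):
            if target[t] == GHOST:
                z = c * kg + sum_f                     # Eq. (12)
                p = [F(k) / z for k in deg]            # Eq. (10)
                old = p_prev[t] or [0.0] * t
                r = rng.random() * q_prev[t]           # scaled by the denominator of Eq. (13)
                acc = 0.0
                for n in range(t):
                    acc += max(0.0, p[n] - old[n])     # numerator of Eq. (13)
                    if r < acc:
                        new_target[t] = n
                        break
                else:
                    if c * kg == 0.0:                  # Q^S(t) = 0: t must connect (rounding guard)
                        new_target[t] = t - 1
                p_prev[t] = p
                q_prev[t] = c * kg / z                 # Eq. (11)
            # Add node t to the state seen by later nodes, using the OLD targets.
            deg.append(1)
            sum_f += F(1)
            if target[t] == GHOST:
                kg += 1
            else:
                m = target[t]
                sum_f += F(deg[m] + 1) - F(deg[m])
                deg[m] += 1
        target = new_target
    return target, steps

target, steps = ghost_node_network(200, 0.5, random.Random(0))
print(steps)
```

All probabilities on a given step are computed from the state at the beginning of that step, which is why the new connections are written to a separate list. The loop is guaranteed to end because the lowest-numbered undetermined node always sees a ghost node of zero weight.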



C. The Parallel Algorithm for Superlinear Kernels 

For the superlinear case, α > 1, a gel node develops to which
almost all nodes connect as N → ∞. When α > 2 all but
a finite number of nodes connect to the gel node. The
parallel algorithm described here takes advantage of the
fact that the vast majority of connections are to the gel
node and the gel node plays a role similar to that of the
ghost node in the sublinear and linear cases. The basic
structure of the algorithm is as follows. In the initialization,
S = 0, phase the sequential algorithm is run so that
all nodes t ≤ t_0 are properly connected; t_0 is chosen so
that a single gel node is firmly established by the time
all nodes t ≤ t_0 are connected. The gel node is firmly
established if the probability that a different node
ultimately becomes the gel node is less than some small value
ε. When α is large t_0 is small, but as α approaches 1,
for fixed ε, t_0 diverges. After the initialization phase,
it is tentatively assumed that all nodes t_0 < t < N
are connected to the gel node. The gel node serves as
a repository for all connections that are not yet
determined. In successive steps, the connection probabilities
of all nodes t_0 < t < N are modified according to the
number of connections that possible destination nodes,
n < t, received in the previous step and lower bounds
on connection probabilities are recalculated. The
difference between old and new probability bounds is used
to find conditional probabilities for moving a connection
from the gel node to some other node. This process is
repeated until no connections are moved from the gel
node to any other node. The nodes that have not been
moved away from the gel node are then determined to be
connected to the gel node.

Following the general strategy, lower bounds on the
connection probabilities for t > t_0 are determined for
each parallel step,

$$ p_n^S(t) = \frac{F(k_n^S(t))}{Z^S(t)} \tag{21} $$

where the normalization is given by

$$ Z^S(t) = \sum_{m=0}^{t-1} F(k_m^S(t)). \tag{22} $$

Note that the connection probabilities are calculated in
the same way for the gel node and the other nodes in
contrast to the sublinear case.

In the first parallel step, S = 1, the algorithm connects
every node t > t_0 to some node n according to
the connection probabilities $p_n^1(t)$. In successive steps,
S > 1, it attempts to re-connect only those nodes t > t_0
that are still connected to the gel node. The conditional
probability $\rho_n^S(t)$ for connecting t > t_0 to n ≠ g on step
S is given by

$$ \rho_n^S(t) = \frac{p_n^S(t) - p_n^{S-1}(t)}{p_g^{S-1}(t)}. \tag{23} $$

The numerator is the probability that t connects to n on
step S and the denominator is the probability that t is
undetermined after step S − 1. The conditional probability
that t is undetermined after step S, given that it
was undetermined after step S − 1, is

$$ \rho_g^S(t) = \frac{p_g^S(t)}{p_g^{S-1}(t)}. \tag{24} $$

The algorithm is finished after step T if no changes occur
from step T − 1 to step T. On step T nodes that
are connected to g are considered to be determined and
actually connected to the gel node.

The algorithm is valid if Eq. (16) holds for all t > t_0
and n ≠ g. Since F(k) is non-decreasing, we require that
$Z^S(t)$ is a non-increasing function of S. From Eq. (22) we
must show that Z either stays constant or decreases from
one parallel step to the next for all t and
S. We can write the requirement for the validity of the
algorithm as

$$ Z^S - Z^{S-1} = \sum_{m=0}^{t-1} \left[ F(k_m^S) - F(k_m^{S-1}) \right] \le 0. \tag{25} $$

It is useful to take the gel node term out of the sum, as
its behavior is different,

$$ Z^S - Z^{S-1} = F(k_g^S) - F(k_g^{S-1}) + \sum_{m=0,\, m \ne g}^{t-1} \left[ F(k_m^S) - F(k_m^{S-1}) \right]. \tag{26} $$

At each parallel step connections are switched from the
gel node to other nodes. For every connection that is
lost by the gel node exactly one connection is gained by
another node. We also note that because F is a convex
function with a continuously increasing derivative, we can
say that for any positive δk

$$ F'(k + \delta k)\,\delta k \ge F(k + \delta k) - F(k) \ge F'(k)\,\delta k. \tag{27} $$

Since $k_g$ is decreasing with S and $k_{m \ne g}$ is increasing with
S we can rewrite Eq. (26), absorbing the contribution of
the gel node into the sum to describe the entire rewiring
of $k_g^{S-1} - k_g^S$ connections from the gel node to other
nodes. We use Eq. (27) to put an upper bound on the change
in Z, the RHS of Eq. (26),

$$ Z^S - Z^{S-1} \le \sum_{m=0,\, m \ne g}^{t-1} \left[ k_m^S - k_m^{S-1} \right] \left[ F'(k_m^S) - F'(k_g^S) \right]. \tag{28} $$

The term on the RHS in the first square brackets is
non-negative. If $k_g^S > k_m^S$ then the term in the second square
brackets is always negative because F' is a strictly increasing
function of k. This argument shows that Eq. (25) holds and
thus the algorithm is valid if the gel node remains the
largest node until the end of the simulation. The value
of ε and, thus, the choice of t_0 determines the error rate
of the algorithm since the algorithm fails if and only if the
gel node loses its status as having the most connections.
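
The role played by the gel node being the largest node can be checked numerically. The sketch below is an illustration only, with hypothetical helper names `normalization` and `move_link`: it moves a single connection away from a candidate gel node for the superlinear kernel F(k) = k^α and compares the normalization of Eq. (22) before and after.

```python
def normalization(degrees, alpha):
    """Z = sum over nodes of F(k) for the kernel F(k) = k**alpha (Eq. 22)."""
    return sum(k ** alpha for k in degrees)

def move_link(degrees, source, dest):
    """Return the degrees after one connection is switched from source to dest."""
    new = list(degrees)
    new[source] -= 1
    new[dest] += 1
    return new

alpha = 1.5
degrees = [50, 3, 1, 1]      # node 0 plays the role of the gel node
z_before = normalization(degrees, alpha)
z_after = normalization(move_link(degrees, 0, 1), alpha)
print(z_after <= z_before)   # Z does not increase while the gel node is the largest
```

If the roles are reversed, for example degrees [3, 50] with a connection moved from the smaller node to the larger one, the normalization increases, which is the failure mode controlled by the choice of t_0.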

D. Redirection Method for Linear Kernels and 
Urns 

This section explores the method proposed by
Krapivsky and Redner for the case of a linear attachment
kernel. We show that this method can be used to
generate the network in O(log log N) steps. The method
works as follows: At sequential time t, node t is connected
to any node n < t with equal probability. With
probability r, however, this connection is redirected to the
"ancestor" of n, the node that n connects to. As Krapivsky
and Redner show, when r = 0.5, this procedure exactly
reproduces the BA model (F(k) = k). For other values
of r, F(k) is asymptotically linear and the connectivity
distribution scales as $P_k \sim k^{-\nu}$ where ν = 1 + 1/r. It is
easy to see why redirection is equivalent to a linear kernel.
A node that already has k connections has k ways to
be connected from a new node since each of the k connections
can serve as a redirection point for the new node.
For r = (1 − r) = 1/2 it is clear that F(k) = kF(1) so
this case corresponds to the homogeneous BA network.

This redirection process can be simulated in
O(log log N) parallel time as follows. First, randomly
connect every node to one of its predecessors. Once this
is done, for every connection, with probability r, make
that connection a redirectable (R) connection, otherwise,
make it a terminal (T) connection. All that remains is to
trace every path of R connections until a T connection
is reached. This can be accomplished using a standard
parallel connectivity algorithm or by the following simple
approach. For every node, t, if its outgoing connection is
type T make no change to the connection. If its outgoing
connection is type R, then it is redirected. Suppose
t connects to i by an R connection and that i connects
to j; then after the parallel step, t connects to j.
Furthermore, if the i to j connection is type T then the new
connection from t to j is type T, otherwise it is an R
connection. When all of the connections are type T, the
program is done and the network is correctly wired. It is
clear that this procedure requires a number of steps that
scales as the logarithm of the longest chain of redirections.
On average, the longest chain of redirections will
behave as the logarithm of the system size. Each connection
redirects with probability r. The average length
of the longest chain of redirections, M, is estimated by
$N r^M \sim 1$ where N is the number of possible starting
points and $r^M$ is the probability of a chain of length M.
Thus log N + M log r ≈ 0 so M ~ −log N / log r. Note
that the chain length saturates at O(log N) rather than
diverging as r → 1. Even if r → 1 each connection will
typically halve the distance to the origin so that there
are O(log N) connections in the longest chain. A chain
of connections of length M can be traced in log M steps,
because each step will halve the length of the chain. Thus
the algorithm will finish in O(log M) = O(log log N)
steps.
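
The tracing of R connections described above is a form of pointer jumping, and a small Python sketch makes the procedure concrete. The function names and the convention that node zero points to itself with a terminal connection are assumptions made for the example; the list comprehensions stand in for one synchronous parallel step in which every node updates its own connection.

```python
import random

def random_links(n_nodes, r, rng):
    """Connect every node to a random predecessor and mark it R with probability r."""
    parent = [0] + [rng.randrange(t) for t in range(1, n_nodes)]
    redirect = [False] + [rng.random() < r for _ in range(1, n_nodes)]
    return parent, redirect

def resolve_sequential(parent, redirect):
    """Reference: an R connection to n ends at the node that n finally connects to."""
    final = [0] * len(parent)
    for t in range(1, len(parent)):
        final[t] = final[parent[t]] if redirect[t] else parent[t]
    return final

def resolve_parallel(parent, redirect):
    """Pointer jumping: every R connection hops over its target in each step."""
    parent, redirect = list(parent), list(redirect)
    n, steps = len(parent), 0
    while any(redirect):
        steps += 1
        new_parent = [parent[parent[t]] if redirect[t] else parent[t] for t in range(n)]
        # The new connection stays redirectable only if the one hopped over was R.
        new_redirect = [redirect[t] and redirect[parent[t]] for t in range(n)]
        parent, redirect = new_parent, new_redirect
    return parent, steps

parent, redirect = random_links(1000, 0.5, random.Random(2))
final, steps = resolve_parallel(parent, redirect)
print(steps, final == resolve_sequential(parent, redirect))
```

The number of passes grows with the logarithm of the longest run of R connections, which is itself logarithmic in N, in line with the O(log log N) estimate.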



The Polya urn model is closely related to the BA grow- 
ing network model and the redirection method can be ap- 
plied to efficiently sample its histories in parallel. In the 
simplest version of the model, an urn initially contains 
two balls, one red and one black. On each time step a 
ball is randomly selected from the urn and then replaced 
along with a new ball of the same color. Thus, after N 
steps the urn contains N + 2 balls. The urn model has
the unusual property that it can have any limit law. For
large N the fraction of red balls approaches a constant
but, with equal probability, that constant can be any 
value from zero to one. The limit law is thus determined 
by the initial choices. 



The urn model can be viewed as a network where each 
ball is a node and the connection from one node to a 
predecessor represents the fact that the color of the later 
node was determined by the earlier node. To find the 
color of a given ball or node, the connections are traced 
back to one of the two initial balls. This representation 
shows that the urn model is identical to the linear net- 
work model in the limit that the redirection probability 
is unity. The typical longest path of connections back 
to the origin is O(log N) since each connection will typically
halve the distance to the origin. Thus the depth of
sampling the history of an urn model is O(log log N).
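
The network view of the urn can be sketched in the same way. In the illustration below, which uses hypothetical function names, balls 0 and 1 are the initial red and black balls, every later ball points to a uniformly chosen earlier ball, and colors are found by repeatedly replacing each pointer with the pointer of its target.

```python
import random

def urn_history_parallel(n_draws, rng):
    """Sample an urn history and resolve every ball's color by pointer jumping."""
    # Balls 0 and 1 are the initial balls and point to themselves.
    parent = [0, 1] + [rng.randrange(b) for b in range(2, n_draws + 2)]
    original = list(parent)
    steps = 0
    while any(p > 1 for p in parent):
        parent = [parent[p] for p in parent]   # every ball jumps to its parent's parent
        steps += 1
    return original, parent, steps             # parent[b] is now 0 (red) or 1 (black)

def urn_history_sequential(original):
    """Reference: color the balls one at a time in the order they were added."""
    color = [0, 1]
    for b in range(2, len(original)):
        color.append(color[original[b]])
    return color

original, color, steps = urn_history_parallel(1000, random.Random(3))
print(steps, color == urn_history_sequential(original))
```

Each pass halves the remaining distance of every ball from the two initial balls, so the number of passes is the logarithm of the longest path.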
V. EFFICIENCY OF PARALLEL ALGORITHMS

A. Efficiency of the Parallel Algorithm when 0 ≤ α < 1

In this section we argue that for a system of size N,
when 0 ≤ α < 1, the parallel algorithm will finish in
O(log N) parallel steps and we estimate the prefactor of
the logarithm. The starting point is an equation for the
expected number of connections to the ghost node on the
S + 1 step given the number of connections on steps S
and S − 1,



$$E\left(k_g^{S+1}(t)\right) = \sum_{t'=1}^{t-1} \left[k_g^{S}(t') - k_g^{S}(t'-1)\right] p_g^{S}(t'). \tag{29}$$



The quantity in the square brackets is 1 if $t'$ is connected
to the ghost node on step $S$ and 0 otherwise, while $p_g^{S}(t')$,
defined earlier, is the conditional probability that $t'$
is connected to the ghost node after step $S$ if it is connected
to the ghost node before step $S$. Equation (29) holds for
$S \ge 2$. Initially, $k_g^{1}(t) = t - 1$. Specializing to the case
that the attachment kernel is a pure power law with exponent
$\alpha$ and ignoring constants that are irrelevant in
the large $t$ limit we have



$$E\left(k_g^{2}(t)\right) \approx \frac{c}{c+1}\,t = \left(1 - 2^{-\alpha}\right) t. \tag{30}$$



This result follows from the fact that the probability that
node $t$ will still be connected to the ghost node after the
first step is, according to the connection probabilities
defined earlier, approximately $c/(c + 1)$. The far RHS of
the expression is obtained from the definition of $c$.

To proceed further we make two approximations.
First, we ignore fluctuations and replace $k_g$ by its
average value on the RHS of Eq. (29),



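$$k_g^{S+1}(t) = \sum_{t'=1}^{t-1} \left[k_g^{S}(t') - k_g^{S}(t'-1)\right] \frac{k_g^{S}(t')}{k_g^{S-1}(t')}\,\frac{Z^{S-1}(t')}{Z^{S}(t')}, \tag{31}$$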



where the notation is simplified in this equation by interpreting
$k_g$ as the average number of connections to the
ghost node and where Eqs. (11) and (15) have been used to
expand $p_g$.

For the case of a linear attachment kernel, $c = 1$ and
the normalization is independent of $S$. The ratio of
normalizations thus drops out of the equation and we
obtain



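$$k_g^{S+1}(t) = \sum_{t'=1}^{t-1} \left[k_g^{S}(t') - k_g^{S}(t'-1)\right] \frac{k_g^{S}(t')}{k_g^{S-1}(t')}. \tag{32}$$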



For sublinear kernels, the choice of $c$ ensures that the ratio
$Z^{S-1}(t')/Z^{S}(t')$ is less than one as discussed at the end of
Sec. IV B. Our second approximation is to assume that
this ratio is unity for the entire sublinear regime. Note
that both $k_g^{1}(t)$ and $k_g^{2}(t)$ are proportional to $t$. It follows
from Eq. (32) that $k_g^{S}(t)$ is proportional to $t$ for all $S$ and
we write $k_g^{S}(t) = \kappa(S)\,t$. This substitution reduces Eq. (32)
to



$$\kappa(S+1) = \frac{\kappa(S)^{2}}{\kappa(S-1)}. \tag{33}$$



Given our approximations, the ratio $\kappa(S)/\kappa(S-1) = 1 - 2^{-\alpha}$
for all $S$ and the solution is $\kappa(S) = (1 - 2^{-\alpha})^{S}$. The
estimate for the number of steps $T$ needed to complete
the algorithm is such that the ghost node is expected to
have fewer than one connection, $\kappa(T) N = 1$. This requirement
leads to the result

$$T = -\frac{\log N}{\log\left(1 - 2^{-\alpha}\right)}. \tag{34}$$



This result is compared to the numerical simulations in
Sec. VI.
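As a numerical illustration of the estimate, the following Python sketch iterates the constant-ratio decay of $\kappa$ until the stopping condition $\kappa(T) N = 1$ is met and compares the count with the closed form of Eq. (34). It is a sketch under assumptions: the function names are hypothetical, $\kappa$ is started from unity, natural logarithms are used, and the closed form is read as $T = -\log N / \log(1 - 2^{-\alpha})$, which is the positive solution of $\kappa(T) N = 1$.

```python
import math


def predicted_steps(alpha, n):
    """Mean-field estimate of Eq. (34), read as -log N / log(1 - 2^-alpha)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("this sketch assumes 0 < alpha <= 1")
    return -math.log(n) / math.log(1.0 - 2.0 ** (-alpha))


def prefactor(alpha):
    """Coefficient of log N in the estimate (natural logarithm)."""
    return predicted_steps(alpha, math.e)


def steps_by_iteration(alpha, n):
    """Multiply kappa by 1 - 2^-alpha until kappa * N no longer exceeds 1."""
    ratio = 1.0 - 2.0 ** (-alpha)
    kappa, steps = 1.0, 0
    while kappa * n > 1.0:
        kappa *= ratio
        steps += 1
    return steps


t_linear = predicted_steps(1.0, 1024)       # equals log2(1024) for alpha = 1
t_iterated = steps_by_iteration(1.0, 1024)  # integer step count
```

For the linear kernel the ratio is $1/2$, so the estimate reduces to $\log_2 N$; for smaller $\alpha$ the ratio shrinks and the coefficient of the logarithm decreases.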



B. Efficiency of the parallel algorithm when $\alpha > 1$

In this section we show that the $\alpha > 1$ algorithm finishes
in constant time independent of $N$, although this
constant diverges as $\alpha \to 1$. The key fact [4] about
superlinear networks is that there is a cut-off

$$k_{\max} = \frac{\alpha}{\alpha - 1} \tag{35}$$

such that only a finite number of nodes have more than
$k_{\max}$ connections. By choosing $t_0$ sufficiently large, no
nodes $t > t_0$ will have more than $k_{\max}$ connections. We
will show that the running time of the parallel part of
the algorithm is roughly $k_{\max}$ steps.

Consider what happens on the first step of the algorithm.
All nodes $t > t_0$ are initially connected to the
gel node so the leading behavior of the normalization is
$Z^{1}(t) \sim t^{\alpha}$ and the leading behavior of the connection
probabilities, defined earlier, is

$$p_n^{1}(t) \sim t^{-\alpha} \tag{36}$$

for $n \ne g$. Summing over all $n \ne g$ we find that on
the first step, the probability that node $t$ will connect
away from the gel node behaves as $t^{1-\alpha}$. The expected
change in the number of nodes connecting to the gel node
during step one is obtained by summing over all nodes
$t_0 < t < N$, with the result

$$\delta k_g^{2}(N) \equiv k_g^{1}(N) - k_g^{2}(N) \sim N^{2-\alpha}. \tag{37}$$



If $\alpha > 2$ no changes are expected to occur in the first
step of the algorithm and we are done. This result is
consistent with the fact that for $\alpha > 2$, there are only
a finite number of nodes with more than one connection
and these are all determined before $t_0$.

For $1 < \alpha < 2$ additional steps are needed before
$\delta k_g^{S}(N)$ is less than one and the algorithm is done. We
make the ansatz that

$$\delta k_g^{S}(t) \sim t^{\gamma(S)} \tag{38}$$
and look for a self-consistent solution for $\gamma(S)$. The running
time $T$ is obtained by solving for the least $T$ such
that $\gamma(T) < 0$.

On the second and later steps of the algorithm, the
conditional connection probabilities, defined in Eqs. (21)
to (23), can be written to leading order and for $n \ne g$ as

$$p_n^{S}(t) \sim \frac{1}{Z^{S-1}(t)}\left[\frac{Z^{S-1}(t)}{Z^{S}(t)}\,F\!\left(k_n^{S}(t)\right) - F\!\left(k_n^{S-1}(t)\right)\right]. \tag{39}$$



There are two ways for $p_n^{S}(t)$ to be non-zero. The first
is for there to have been a new connection from $t'$ to $n$,
with $n < t' < t$, in step $S-1$. The expected number
of nodes $n$ that received new connections in step $S-1$
is just $\delta k_g^{S}(t)$. Since $k_g^{S}(t) \sim t$ for all $S$, the leading
behavior of $p_n^{S}(t)$ is $t^{-\alpha}$ and the overall probability that
$t$ will connect away from the gel node by this mechanism
scales as $\delta k_g^{S}(t)\,t^{-\alpha} \sim t^{\gamma(S)-\alpha}$.

The second way for $p_n^{S}(t)$ to be non-zero is for the ratio
$Z^{S-1}(t)/Z^{S}(t)$ to exceed unity. This possibility applies
for all target nodes, $n < t$, $n \ne g$. The leading behavior
of this ratio is given by

$$\frac{Z^{S-1}(t)}{Z^{S}(t)} \sim \frac{\left(k_g^{S}(t) + \delta k_g^{S}(t)\right)^{\alpha}}{k_g^{S}(t)^{\alpha}} \tag{40}$$

so that the leading behavior of $p_n^{S}(t)$ is $\delta k_g^{S}(t)\,t^{-\alpha-1}$.
Since there are $t$ target nodes, the total probability that
$t$ will connect away from the gel node by this mechanism
again scales as $t^{\gamma(S)-\alpha}$.

Combining both of the above mechanisms for $t$ to connect
away from the gel node and summing over all $t$,
$t_0 < t < N$ (and still connected to the gel node) we obtain
an expression for $\delta k_g^{S+1}(N)$, the expected number of
nodes directed away from the gel node on step $S$,

$$\delta k_g^{S+1}(N) \sim N^{\gamma(S)+1-\alpha}. \tag{41}$$



Using the ansatz of Eq. (38) we obtain the recursion relation

$$\gamma(S+1) = \gamma(S) + 1 - \alpha. \tag{42}$$

The recursion relation and the initial condition $\gamma(2) = 2 - \alpha$,
Eq. (37), have the solution

$$\gamma(S) = \alpha - (\alpha - 1) S. \tag{43}$$



The running time of the algorithm is obtained from the
least $T$ for which $\gamma(T)$ is negative,

$$T = \frac{\alpha}{\alpha - 1} \tag{44}$$

or, from Eq. (35),

$$T = k_{\max}. \tag{45}$$
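The recursion for the exponent can be checked with a few lines of Python. This is a sketch with hypothetical function names; it uses exact rational arithmetic so that the sign test on $\gamma$ is not affected by rounding, starts from $\gamma(2) = 2 - \alpha$, and returns the first integer step at which $\gamma$ is strictly negative, which is the first integer above the real-valued estimate $\alpha/(\alpha - 1)$.

```python
from fractions import Fraction


def gamma_closed(alpha, step):
    """Closed form gamma(S) = alpha - (alpha - 1) S."""
    alpha = Fraction(alpha)
    return alpha - (alpha - 1) * step


def first_negative_step(alpha):
    """Iterate gamma(S+1) = gamma(S) + 1 - alpha from gamma(2) = 2 - alpha
    and return the first integer S with gamma(S) < 0."""
    alpha = Fraction(alpha)
    if alpha <= 1:
        raise ValueError("this sketch assumes alpha > 1")
    step, gamma = 2, 2 - alpha
    while gamma >= 0:
        gamma += 1 - alpha
        step += 1
    return step


alpha = Fraction(5, 4)
k_max = alpha / (alpha - 1)              # real-valued cut-off alpha/(alpha-1)
t_stop = first_negative_step(alpha)      # first integer step with gamma < 0
```

The step count grows without bound as $\alpha \to 1$ from above, since each step lowers $\gamma$ by only $\alpha - 1$.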



This result can be understood in terms of the following
sequence of events for creating connections for nodes
beyond $t_0$. In the first parallel step almost all nodes
with two connections are generated. In the second parallel
step a small fraction of these nodes develop a third
connection and a comparable number of nodes with one
connection get a second connection. On the third step,
an even smaller number of nodes with three connections
get a fourth connection and so on until nothing happens.
Note that the analysis of the algorithm reproduces the
results in [4] for the scaling of the number of nodes with
$2, 3, \ldots, k_{\max}$ connections.



VI. SIMULATION RESULTS FOR LINEAR AND 
SUBLINEAR KERNELS 

In Sec. V A we argued that the algorithm for the sublinear
kernel requires logarithmic parallel time to generate
a network and in Eq. (34) we estimated the coefficient of the
logarithm. In this section we support these conclusions
with a simulation of the parallel algorithm on a single
processor workstation. In the simulation the work of each
processor on the PRAM is done in sequence, making sure
not to update the database describing the network until
a parallel step is completed. We generated 1000 networks
for each value of $\alpha$ and for each system size. Values of
$\alpha$ ranged from 0 to 1 in increments of 0.05 and system
sizes from 50 nodes to 12,800 nodes, with each size
separated by a factor of two. Figure 1 shows the average
number of parallel steps vs. system size for $\alpha = 0.25$, 0.5,
0.75 and 1.0. The figure demonstrates the logarithmic
dependence of the average running time $T$ on system size
$N$ for all values of $\alpha$ and the full range of system sizes
so that, to good approximation, $T = A(\alpha) \log N$.
Figure 2 shows a plot of the coefficient $A$ as a function
of $\alpha$. The results are plotted for $0 \le \alpha \le 1$. The prediction
of Eq. (34) is shown on the same figure. Although not
perfect, the approximation of Eq. (34) captures the general
trend of the data and is within a few percent of the
numerical results for $\alpha < 0.8$. The larger fluctuations
in connectivity near $\alpha = 1$ may explain why the "mean
field" assumption underlying the theoretical curve loses
accuracy there. The theoretical estimate does appear to
correctly predict that $A(\alpha)$ approaches zero with infinite
slope as $\alpha \to 0$.
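The requirement that the database not be updated until a parallel step is completed can be illustrated with a small Python sketch. It is a generic illustration rather than the simulation code used here: `parallel_step`, `in_place_step` and the pointer-doubling rule `jump` are hypothetical names. The point is that processing the processors in sequence against a frozen snapshot reproduces one PRAM step, whereas updating in place lets later processors see earlier results and would undercount the number of parallel steps.

```python
def parallel_step(state, rule):
    """Emulate one PRAM step: every processor i reads the same frozen
    snapshot, and results are committed only after all have finished."""
    snapshot = tuple(state)  # immutable copy; nothing below can modify it
    return [rule(i, snapshot) for i in range(len(snapshot))]


def in_place_step(state, rule):
    """Sequential update that lets later processors see earlier writes.
    Shown only for contrast; it does not emulate a parallel step."""
    for i in range(len(state)):
        state[i] = rule(i, state)
    return state


def jump(i, s):
    """Pointer-doubling rule: node i jumps to its pointer's pointer."""
    return s[s[i]]


chain = [0, 0, 1, 2, 3]                      # 0 <- 1 <- 2 <- 3 <- 4
synchronous = parallel_step(chain, jump)     # pointers advance by one hop
cascaded = in_place_step(list(chain), jump)  # collapses in a single sweep
```

In this example the synchronous update moves each pointer one hop further along the chain, while the in-place sweep resolves the whole chain at once, which is why the two must be kept distinct when counting parallel steps.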



VII. DISCUSSION 

We have examined the parallel computational com- 
plexity of generating networks obeying preferential at- 
tachment growth rules. We demonstrated that these net- 
works can be sampled in parallel time that is much less 
than the size of the network. This result is surprising be- 
cause the defining rules for generating these networks are 
sequential with nodes added to the network one at a time 
depending on the present state of the network. Nonethe- 
less, we have bounded the depth of sampling growing 
networks by exhibiting efficient parallel algorithms for 
the three cases, $0 \le \alpha < 1$, $\alpha = 1$ and $\alpha > 1$. The average
parallel running time for the $0 \le \alpha < 1$ algorithm is
logarithmic, the algorithm for the BA scale free network
runs in $\Theta(\log\log N)$ time and for $\alpha > 1$ the algorithm
runs in constant time.

FIG. 1: The average parallel time $T$ to generate a network as
a function of system size $N$ for $\alpha = 0.25$, 0.5, 0.75 and 1.0,
from bottom to top, respectively.

FIG. 2: The coefficient $A$ of the leading logarithmic term in
the running time versus $\alpha$. The points are the results of the
simulation and the solid line is the theoretical approximation,
Eq. (34).

Growing networks thus provide an example of a discontinuous
phase transition in complexity as a function of $\alpha$
at $\alpha = 1$. It is not surprising that a complexity transition
occurs at $\alpha = 1$ since this is where the structural properties
of the system also undergo a discontinuous transition
from a high temperature ($\alpha < 1$) regime where no nodes
have a finite fraction of the connections to a low temperature
($\alpha > 1$) regime where there is a single gel node
with almost all connections. It is noteworthy that parallel
time is the proper resource to observe this transition.
The more common complexity measure of sequential time
or computational work has no transition since it requires
$O(N)$ time to give an explicit description of the network
for any $\alpha$.

Our results set upper bounds on the depth of sampling
growing networks but we cannot rule out the existence of
yet faster parallel algorithms. For example, if a constant
time algorithm exists for $0 \le \alpha < 1$, it would modify the
conclusion that there is a discontinuous complexity transition
at $\alpha = 1$. There are few rigorous lower bounds in
computational complexity theory, so, in general, conclusions
concerning the depth of sampling and the existence
of complexity transitions in statistical physics must be
considered tentative.

In this paper we have presented a general strategy
for parallelizing a broad class of sequential stochastic
processes, exemplified by the coin toss with memory.
We have applied the general method to create algorithms
that efficiently parallelize preferential attachment
network models. The general method should be more
broadly applicable to growing network models with more
complicated rules. To give one example, Hajra and
Sen [16] extend the preferential attachment model to include
an aging factor ($F(k)$ becomes $F(k, t - n)$) so that
older nodes are either favored or avoided depending on a
parameter. Our algorithm can be modified to efficiently
handle this class of models.

It is also instructive to examine a growing network
model where our general method is not efficient. If $\alpha < 0$,
a case examined by Onody and deCastro, the general
method can be applied but will not be efficient. The
problem is that lower bounds on connection probabilities
are typically extremely small and the algorithm will
connect only a few nodes in each parallel step. We are
currently investigating methods to efficiently parallelize
$\alpha < 0$ networks.

The fact that preferential attachment growing networks
have no more than logarithmic depth indicates
that they are not particularly complex objects. On the
other hand, very complex biological and social systems
generate networks with similar properties. If growing
network models accurately describe the networks generated
by these systems, one must conclude that the complexity
and history dependence of the systems generating
the networks are not manifest in the networks themselves.
An alternative possibility is that the real networks
are themselves complex but that growing network models
lack some essential statistical properties of the real
networks.